Learning in concave games with imperfect information

Author

  • Panayotis Mertikopoulos
Abstract

This paper examines the convergence properties of a class of learning schemes for concave N-person games – that is, games with convex action spaces and individually concave payoff functions. Specifically, we focus on a family of learning methods where players adjust their actions by taking small steps along their individual payoff gradients and then “mirror” the output back to their feasible action spaces. Assuming players only have access to gradient information that is accurate up to a zero-mean error with bounded variance, we show that when the process converges, its limit is a Nash equilibrium. We also introduce an equilibrium stability notion which we call variational stability (VS), and we show that stable equilibria are locally attracting with high probability whereas globally stable states are globally attracting with probability 1. Additionally, in finite games, we find that dominated strategies become extinct, strict equilibria are locally attracting with high probability, and the long-term average of the process converges to equilibrium in 2-player zero-sum games. Finally, we examine the scheme’s convergence speed and we show that if the game admits a strict equilibrium and the players’ mirror maps are surjective, then, with high probability, the process converges to equilibrium in a finite number of steps, no matter the level of uncertainty.
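As a rough illustration of the kind of scheme the abstract describes, the sketch below runs a gradient-step-then-mirror update in a 2-player zero-sum finite game and tracks the step-size-weighted average of play. It is not the paper's code: the choice of Matching Pennies, the softmax (entropic) mirror map, the Gaussian noise model, the 1/sqrt(n) step size, and all function names are assumptions made for this example.

```python
import math
import random


def softmax(scores):
    """Entropic mirror map (an assumed choice): scores -> mixed strategy."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def noisy_payoff_gradients(x, y, sigma, rng):
    """Individual payoff gradients in Matching Pennies plus zero-mean noise."""
    A = [[1.0, -1.0], [-1.0, 1.0]]  # player 1's payoffs; player 2 receives -A
    v1 = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
    v2 = [-sum(A[i][j] * x[i] for i in range(2)) for j in range(2)]
    g1 = [g + rng.gauss(0.0, sigma) for g in v1]
    g2 = [g + rng.gauss(0.0, sigma) for g in v2]
    return g1, g2


def run(steps=20000, sigma=0.5, seed=0):
    """Return the step-size-weighted average strategies of both players."""
    rng = random.Random(seed)
    score1, score2 = [1.0, 0.0], [0.0, 1.0]  # start away from equilibrium
    avg1, avg2, weight = [0.0, 0.0], [0.0, 0.0], 0.0
    for n in range(1, steps + 1):
        step = 1.0 / math.sqrt(n)  # assumed vanishing step-size schedule
        # Mirror the accumulated scores back to the feasible set (the simplex).
        x, y = softmax(score1), softmax(score2)
        g1, g2 = noisy_payoff_gradients(x, y, sigma, rng)
        # Small step along each player's (noisy) individual payoff gradient.
        score1 = [s + step * g for s, g in zip(score1, g1)]
        score2 = [s + step * g for s, g in zip(score2, g2)]
        # Accumulate the weighted average of the strategies actually played.
        avg1 = [a + step * xi for a, xi in zip(avg1, x)]
        avg2 = [a + step * yi for a, yi in zip(avg2, y)]
        weight += step
    return [a / weight for a in avg1], [a / weight for a in avg2]
```

Calling `run()` returns the two averaged mixed strategies. In this game the unique equilibrium is uniform play for both players, so the averages should drift toward (1/2, 1/2) even though the gradient feedback is noisy, in line with the abstract's statement about long-term averages in 2-player zero-sum games.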

Similar resources

A Unified View of Large-Scale Zero-Sum Equilibrium Computation

The task of computing approximate Nash equilibria in large zero-sum extensive-form games has received a tremendous amount of attention due mainly to the Annual Computer Poker Competition. Immediately after its inception, two competing and seemingly different approaches emerged: one an application of no-regret online learning, the other a sophisticated gradient method applied to a convex-concave s...

Equilibrium Selection in Evolutionary Games with Imperfect Monitoring

In this paper we analyze players’ long-run behavior in evolutionary coordination games with imperfect monitoring in a large population. Players can observe signals corresponding to other players’ unseen actions and use the proposed simple or maximum likelihood estimation algorithm to extract information from the signals. In the simple learning process we find conditions for the risk-dominant an...

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without any prior knowledge. Our method combines fictitious sel...

Enhancing Artificial Intelligence on a Real Mobile Game

Mobile games represent a killer application that is attracting millions of subscribers worldwide. One of the aspects crucial to the commercial success of a game is ensuring an appropriately challenging artificial intelligence (AI) algorithm against which to play. However, creating this component is particularly complex as classic search AI algorithms cannot be employed by limited devices such a...

On the Power of Imperfect Information

We present a polynomial-time reduction from parity games with imperfect information to safety games with imperfect information. Similar reductions for games with perfect information typically increase the game size exponentially. Our construction avoids such a blow-up by using imperfect information to realise succinct counters which cover a range exponentially larger than their size. In particu...

Journal title:
  • CoRR

Volume abs/1608.07310  Issue 

Pages  -

Publication date 2016